{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Text-guided depth-to-image generation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The [StableDiffusionDepth2ImgPipeline](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/depth2img#diffusers.StableDiffusionDepth2ImgPipeline) lets you pass a text prompt and an initial image to condition the generation of new images. In addition, you can also pass a `depth_map` to preserve the image structure. If no `depth_map` is provided, the pipeline automatically predicts the depth via an integrated [depth-estimation model](https://github.com/isl-org/MiDaS).\n",
    "\n",
    "Start by creating an instance of the [StableDiffusionDepth2ImgPipeline](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/depth2img#diffusers.StableDiffusionDepth2ImgPipeline):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from diffusers import StableDiffusionDepth2ImgPipeline\n",
    "from diffusers.utils import load_image, make_image_grid\n",
    "\n",
    "pipeline = StableDiffusionDepth2ImgPipeline.from_pretrained(\n",
    "    \"stabilityai/stable-diffusion-2-depth\",\n",
    "    torch_dtype=torch.float16,\n",
    "    use_safetensors=True,\n",
    ").to(\"cuda\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now pass your prompt to the pipeline. You can also pass a `negative_prompt` to prevent certain words from guiding how an image is generated:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "url = \"http://images.cocodataset.org/val2017/000000039769.jpg\"\n",
    "init_image = load_image(url)\n",
    "prompt = \"two tigers\"\n",
    "negative_prompt = \"bad, deformed, ugly, bad anatomy\"\n",
    "image = pipeline(prompt=prompt, image=init_image, negative_prompt=negative_prompt, strength=0.7).images[0]\n",
    "make_image_grid([init_image, image], rows=1, cols=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "| Input                                                                           | Output                                                                                                                                |\n",
    "|---------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------|\n",
    "| <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/coco-cats.png\" width=\"500\"/> | <img src=\"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/depth2img-tigers.png\" width=\"500\"/> |"
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 4
}
